Skip to content

[cuebot] Release procs on frame-complete for scheduler-managed shows#2403

Merged
DiegoTavares merged 5 commits into
AcademySoftwareFoundation:masterfrom
DiegoTavares:fix_host_core_drift
Jun 10, 2026
Merged

[cuebot] Release procs on frame-complete for scheduler-managed shows#2403
DiegoTavares merged 5 commits into
AcademySoftwareFoundation:masterfrom
DiegoTavares:fix_host_core_drift

Conversation

@DiegoTavares

@DiegoTavares DiegoTavares commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

After #2323, the frame-complete proc-reuse path in FrameCompleteHandler still rebooked procs for scheduler-managed shows. That races the Rust scheduler, stranding pk_frame=NULL procs that hold reserved cores, so hosts report too few idle cores.

Gate the reuse path on showDao.isSchedulerManaged: for non-local procs on scheduler-managed shows, unbookProc immediately instead of queueing DispatchNextFrame. This frees the cores and avoids double-dispatching a frame the scheduler owns.

Add FrameCompleteHandlerTests coverage for both the scheduler-managed (proc released) and control (proc reused) cases.

LLM usage disclosure

Claude Opus was used to investigate and propose a fix for this issue

Summary by CodeRabbit

  • Bug Fixes

    • Procs for scheduler-managed shows are now released immediately after frame completion, preventing unnecessary retention, stranded-core checks, and redundant re-dispatch/next-frame scheduling.
  • Tests

    • Added tests confirming scheduler-managed shows release procs on terminal states and non-scheduler-managed shows continue through the existing reuse flow.

After AcademySoftwareFoundation#2323, the frame-complete proc-reuse path in FrameCompleteHandler
still rebooked procs for scheduler-managed shows. That races the Rust
scheduler, stranding pk_frame=NULL procs that hold reserved cores, so
hosts report too few idle cores.

Gate the reuse path on showDao.isSchedulerManaged: for non-local procs on
scheduler-managed shows, unbookProc immediately instead of queueing
DispatchNextFrame. This frees the cores and avoids double-dispatching a
frame the scheduler owns.

Add FrameCompleteHandlerTests coverage for both the scheduler-managed
(proc released) and control (proc reused) cases.
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e64fb5af-7a50-4036-9675-f56feacc85b3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Frame complete handler now exits early for scheduler-managed shows: when a frame is WAITING or SUCCEEDED and the proc is not locally dispatched, it unbooks the proc and returns immediately, skipping stranded-core handling and re-dispatch/next-frame scheduling. Tests validate release vs reuse behavior.

Changes

Scheduler-Managed Show Proc Release

Layer / File(s) Summary
Scheduler-managed show short-circuit logic
cuebot/src/main/java/com/imageworks/spcue/dispatcher/FrameCompleteHandler.java
Handler adds early return when frame state is WAITING or SUCCEEDED and the proc belongs to a scheduler-managed show, unbooking the proc and skipping downstream stranded-core and scheduling logic.
Proc release/reuse validation tests
cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java
Test class imports ShowDao, updates Mockito static imports, adds executeProcReuseGate(boolean, FrameState) helper, and adds four transactional tests asserting that scheduler-managed shows release procs while non-scheduler-managed shows reuse them for both SUCCEEDED and WAITING states.

Sequence Diagram(s)

sequenceDiagram
  participant FrameCompleteHandler
  participant ShowDao
  participant ProcDao
  participant DispatchSupport
  FrameCompleteHandler->>ShowDao: isSchedulerManaged(showId)
  alt scheduler-managed == true
    FrameCompleteHandler->>ProcDao: unbookProc(proc, "scheduler-managed show, releasing proc")
    FrameCompleteHandler-->>DispatchSupport: return (stop further processing)
  else scheduler-managed == false
    FrameCompleteHandler->>DispatchSupport: continue stranded-core / re-dispatch flow
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • lithorus
  • ramonfigueiredo

Poem

🐰
I hop where procs are freed and clean,
Scheduler-managed shows keep the scene.
An early unbook, a gentle sigh,
Tests nod yes, no reuse nearby.
Hooray — procs released, tails held high!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: releasing procs for scheduler-managed shows during frame completion, which directly addresses the race condition preventing proc reuse.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java (2)

564-566: ⚡ Quick win

Make scheduler-managed state explicit for both test branches.

The helper currently depends on fixture defaults for the false case. Set the show flag unconditionally so the test setup is deterministic.

Suggested change
-        if (schedulerManaged) {
-            showDao.updateSchedulerManaged(showDao.getShowDetail(proc.getShowId()), true);
-        }
+        showDao.updateSchedulerManaged(
+                showDao.getShowDetail(proc.getShowId()), schedulerManaged);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java`
around lines 564 - 566, Replace the conditional that only updates
scheduler-managed when true with an unconditional call to
showDao.updateSchedulerManaged using the current schedulerManaged variable so
the test explicitly sets the flag for both branches; locate the existing if
(schedulerManaged) block around
showDao.updateSchedulerManaged(showDao.getShowDetail(proc.getShowId()), true)
and change it to always call
showDao.updateSchedulerManaged(showDao.getShowDetail(proc.getShowId()),
schedulerManaged) to make the setup deterministic.

590-604: ⚡ Quick win

Add coverage for the WAITING scheduler-managed branch.

The new handler gate applies to WAITING and SUCCEEDED, but these tests currently exercise only SUCCEEDED. Add a WAITING variant to lock in both sides of the condition.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java`
around lines 590 - 604, The tests only exercise the SUCCEEDED branch of the
handler gate; add coverage for the WAITING scheduler-managed branch by creating
new test variants that mirror testSchedulerManagedShowReleasesProc and
testNonSchedulerManagedShowReusesProc but drive the handler with a WAITING
terminal state instead of SUCCEEDED. Use the existing helper
procSurvivesReuseAfterComplete (or create a small wrapper) to invoke the same
logic for WAITING, and add assertions matching the scheduler-managed and
non-scheduler-managed expectations (assertFalse for scheduler-managed,
assertTrue for non-scheduler-managed) to lock both sides of the condition;
reference the existing test methods testSchedulerManagedShowReleasesProc and
testNonSchedulerManagedShowReusesProc and the WAITING/SUCCEEDED gate names when
locating where to add the new tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java`:
- Around line 564-566: Replace the conditional that only updates
scheduler-managed when true with an unconditional call to
showDao.updateSchedulerManaged using the current schedulerManaged variable so
the test explicitly sets the flag for both branches; locate the existing if
(schedulerManaged) block around
showDao.updateSchedulerManaged(showDao.getShowDetail(proc.getShowId()), true)
and change it to always call
showDao.updateSchedulerManaged(showDao.getShowDetail(proc.getShowId()),
schedulerManaged) to make the setup deterministic.
- Around line 590-604: The tests only exercise the SUCCEEDED branch of the
handler gate; add coverage for the WAITING scheduler-managed branch by creating
new test variants that mirror testSchedulerManagedShowReleasesProc and
testNonSchedulerManagedShowReusesProc but drive the handler with a WAITING
terminal state instead of SUCCEEDED. Use the existing helper
procSurvivesReuseAfterComplete (or create a small wrapper) to invoke the same
logic for WAITING, and add assertions matching the scheduler-managed and
non-scheduler-managed expectations (assertFalse for scheduler-managed,
assertTrue for non-scheduler-managed) to lock both sides of the condition;
reference the existing test methods testSchedulerManagedShowReleasesProc and
testNonSchedulerManagedShowReusesProc and the WAITING/SUCCEEDED gate names when
locating where to add the new tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06bf8f04-6a14-439c-8283-6087d6c9f3e0

📥 Commits

Reviewing files that changed from the base of the PR and between e504bc9 and c9d94b6.

📒 Files selected for processing (2)
  • cuebot/src/main/java/com/imageworks/spcue/dispatcher/FrameCompleteHandler.java
  • cuebot/src/test/java/com/imageworks/spcue/test/dispatcher/FrameCompleteHandlerTests.java

DiegoTavares and others added 4 commits June 9, 2026 21:08
The first version wrote b_scheduler_managed via showDao.updateSchedulerManaged,
which also primes ShowDaoJdbc's in-memory cache. That cache is not rolled back
with the transaction, so it leaked scheduler-managed behavior into other tests
(e.g. RedirectManagerTests). The control test was also nondeterministic: the
depend-job reuse path can unbook the proc downstream regardless of the show flag.

Stub isSchedulerManaged on a spy instead (no DB write, no cache mutation) and
assert the branch via the unique unbookProc reason string, restoring the spied
beans afterward. Deterministic and contamination-free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ests

The DispatchSupport/ShowDao beans are Spring JDK dynamic proxies (final
classes), so Mockito.spy() fails with "Cannot mock/spy class $Proxy".
Use mock(Interface.class, delegatesTo(realBean)) instead: it forwards all
calls to the real bean except the stubbed isSchedulerManaged, while still
recording invocations for verify().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Without a host on the FrameCompleteReport, report.getHost().getFreeMem()
returns 0, which trips the < 512MB low-memory unbook in
handlePostFrameCompleteOperations. The proc is then released at line ~454
and the method returns before reaching the WAITING/SUCCEEDED proc-reuse
branch that holds the scheduler-managed gate, so the gate was never
exercised. Attach an UP host with 8GB free so the flow reaches line 529.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The gate fires for FrameState.WAITING || SUCCEEDED but the tests only drove
SUCCEEDED. Parameterize executeProcReuseGate on the terminal FrameState and
add WAITING variants for both the scheduler-managed (release) and control
(no release) cases, locking both sides of the condition.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@DiegoTavares DiegoTavares merged commit 57ab286 into AcademySoftwareFoundation:master Jun 10, 2026
58 of 60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants